Abstract


Two  distinct  approaches,  one  based  on  item-response  theory  and  the 
other  based  on  observed  item  responses  and  standard  summary  statistics, 
have  been  proposed  to  identify  unusual  response  patterns.  A  link 
between  these  two  approaches  is  provided  by  showing  certain 
correspondences  between  Sato's  S-P  curve  Theory  and  item  response 
theory.  This  link  makes  possible  several  extensions  of  Sato's  caution 
index  that  take  advantage  of  the  results  of  item  response  theory. 

Several  such  indices  are  introduced  and  their  use  illustrated  by 
application  to  a  set  of  achievement  test  data.  Two  of  the  newly 
introduced  extended  indices  were  found  to  be  very  effective  for  purposes 
of identifying persons who consistently use an erroneous rule in
attempting to solve signed-number arithmetic problems. The potential
importance of this result is briefly discussed.
Introduction


Several authors have recently shown an interest in using patterns of
response to test items to extract information not contained in the
total score. A variety of purposes have been envisioned for use of the
additional information. Wright (1977), for example, refers to
identification of "guessing, sleeping, fumbling, and plodding" (p. 110)
from the plots of residual item scores based on the differences between
item responses and the expected responses for an individual based on
the Rasch model. Levine and Rubin (1979) discuss response patterns that
are "so atypical ... that his or her aptitude test score fails to be a
completely appropriate measure" (p. 269). Sato (1975) proposed a
"caution" index which is intended to identify students whose total
scores on a test must be treated with caution. Tatsuoka and Tatsuoka
(1980) and Harnisch and Linn (1981) have discussed the relationship of
response patterns to instructional experiences and the possible use of
item response pattern information to help diagnose the types of errors
a student is making.

Indices of the degree to which an individual's pattern of responses is
unusual are conveniently classified into two general types: those that
use item response theory (IRT) to identify unusual patterns and those
that rely only on observed item responses and standard summary
statistics based on those responses (e.g., the number or proportion of
people in a norm group answering an item correctly). The work of Wright
(1979) and of Levine and Rubin (1979) exemplifies approaches based on
IRT, while the work of Sato (1975), Tatsuoka and Tatsuoka (1980), and
Harnisch and Linn (1981) is of the latter type.
The primary purpose of this paper is to develop a link between these
two general approaches. More specifically, we will show a
correspondence between Sato's (1975) S-P curve theory and the test
response curves and "group response curves" developed from IRT. Also,
Sato's caution index defined in S-P curve theory is generalized to a
continuous domain using IRT. That is, S-P curve theory and the caution
index were originally developed in the discrete domain of 0-1 scoring,
but this study extends the theory to the more general case of
probabilities.

Several different generalized versions of the caution index are
presented. Results of applying these indices suggest that they fall
into two categories. One set of indices functions in a manner similar
to Sato's original index. The other set functions more like Tatsuoka
and Tatsuoka's Individual Consistency Index in that it successfully
distinguishes examinees who make consistent errors in responding to
test items.

We first briefly review Sato's S-P curve theory. Next, a group response
curve (GRC) is developed for the one parameter logistic model. The GRC
is based on the dualistic nature of the one parameter logistic model,
which depends on the choice of fixed and random parameters in the
model. We then present an extended caution index with several special
cases which are applicable to IRT. The cases of two and three parameter
logistic models are briefly discussed with special attention given to
problems with person and group response curves in these models.

Finally,  we  discuss  applications  of  the  new  caution  indices  for  the 
detection  of  anomalous  response  patterns. 
S-P Curve Theory

Sato's (1975) caution index is applicable to either an item or an
individual examinee. In either form the index is conveniently obtained
from a specially arranged table of binary item scores referred to as an
"S-P table". The S-P table, the associated S-P curves and various
indices such as the caution index are widely used in Japan for
diagnosing student performance, detecting aberrant response patterns
and assessing the quality of a test or instructional sequence.

The  S-P  table  is  a  data  matrix  in  which  the  students  (represented 
by  rows)  have  been  arranged  in  descending  order  of  their  total  test 
scores  from  top  to  bottom  and  the  items  (represented  by  columns)  have 
been  arranged  in  ascending  order  of  difficulty  from  left  to  right.  A 
hypothetical S-P table is shown in Table 1. The solid stair-step line
is  called  an  S-curve  which  is  short  for  Student  curve.  For  each  person, 
represented  by  a  given  row,  a  vertical  line  is  drawn  to  the  right  of  the 
nth  cell  from  the  left  where  n  is  the  number  of  correct  answers  obtained 
by  that  person.  The  S-curve  is  then  obtained  by  connecting  the  right 
edge  of  the  nth  cell  of  each  row.  The  P-curve  is  drawn  in  an  analogous 
fashion  by  counting  down  from  the  top  the  number  of  cells  equal  to  the 
number  of  students  who  correctly  answered  the  item  corresponding  to  a 
given  column.  The  P-curve  for  the  data  in  Table  1  is  shown  by  the 
dashed  line. 
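
The arrangement described above can be sketched in a few lines of Python. This is a minimal illustration, not Sato's own procedure: the function name `sp_table`, the toy 4 x 4 score matrix and the tie-breaking rule (ties keep their input order) are assumptions made for the example. Items are ordered from easiest to hardest by sorting on the number of correct answers.

```python
def sp_table(scores):
    """Arrange a binary score matrix (rows = students, columns = items) as an S-P table."""
    n_students, n_items = len(scores), len(scores[0])
    row_totals = [sum(row) for row in scores]
    col_totals = [sum(row[j] for row in scores) for j in range(n_items)]
    # Students in descending order of total score; items from easiest to hardest
    # (descending number of correct answers). Ties keep their input order.
    row_order = sorted(range(n_students), key=lambda i: -row_totals[i])
    col_order = sorted(range(n_items), key=lambda j: -col_totals[j])
    table = [[scores[i][j] for j in col_order] for i in row_order]
    # S-curve: in each row, the step lies to the right of the n-th cell,
    # where n is that student's number of correct answers.
    s_curve = [row_totals[i] for i in row_order]
    # P-curve: in each column, the step lies below the k-th cell,
    # where k is the number of students answering that item correctly.
    p_curve = [col_totals[j] for j in col_order]
    return table, s_curve, p_curve, row_order, col_order


# Hypothetical data: 4 students by 4 items.
scores = [
    [1, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 1, 0],
    [1, 0, 1, 1],
]
table, s_curve, p_curve, row_order, col_order = sp_table(scores)
for row in table:
    print(row)
```

In this toy example every 1 lies to the left of the S-curve and above the P-curve, so the two curves coincide; real data generally show 0's and 1's on both sides of each curve.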

Insert  Table  1  about  here 

Let $y_{ij}$ be the binary response of student (row) $i$ to item
(column) $j$ of the S-P table. Row and column sums are denoted by
$y_{i.}$ and $y_{.j}$ respectively. The total number of ones in the S-P
table is denoted by $y_{..}$, and the proportions of correct responses
by $P_{i.}$, $P_{.j}$ and $P_{..}$ for the row, column and entire table
respectively. As can be seen in Table 1, the S-curve is the
step-function ogive of the cumulative distribution function of total
scores, $y_{i.}$, for the 15 students, and the P-curve is the
corresponding function of $y_{.j}$, the number of right answers for the
10 items.

Table 1

A Hypothetical Score Matrix ($y_{ij}$) and
S- (solid line) and P- (dotted line) Curves

Insert  Table  2  about  here 

If the S-curve is held invariant and all the 0's to the left of the
S-curve are changed to 1's and all the 1's to the right of the same
curve to 0's, the result, shown in Table 2, is called a perfect
S-curve. The entries in Table 2 are denoted $M^{S}_{ij}$. A perfect
P-curve is obtained similarly, and the entries in the resulting table
are denoted by $M^{P}_{ij}$. As can be seen, $M^{S}_{i.} = y_{i.}$ for
all $i$, which corresponds to the fact that the S-curve is unchanged by
changing the cell entries from $y_{ij}$ to $M^{S}_{ij}$. The column
sums for Tables 1 and 2, i.e., $y_{.j}$ and $M^{S}_{.j}$, are not in
general equal, however.
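
A short sketch may help make the construction concrete. It assumes the table has already been arranged as an S-P table; the function names `perfect_s` and `perfect_p` and the 4 x 4 example are hypothetical.

```python
def perfect_s(table):
    """Keep each row total y_i. but move all 1's in the row to the leftmost cells."""
    n_items = len(table[0])
    return [[1 if j < sum(row) else 0 for j in range(n_items)] for row in table]


def perfect_p(table):
    """Keep each column total y_.j but move all 1's in the column to the top cells."""
    n_students, n_items = len(table), len(table[0])
    col_totals = [sum(row[j] for row in table) for j in range(n_items)]
    return [[1 if i < col_totals[j] else 0 for j in range(n_items)]
            for i in range(n_students)]


def column_sums(matrix):
    return [sum(row[j] for row in matrix) for j in range(len(matrix[0]))]


# Hypothetical S-P table (rows and columns already ordered).
table = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
m_s = perfect_s(table)
m_p = perfect_p(table)

print([sum(r) for r in table], [sum(r) for r in m_s])  # row sums are preserved
print(column_sums(table), column_sums(m_s))            # column sums generally differ
```

The row sums of the perfect S table equal those of the observed table, while its column sums need not, which is the asymmetry noted in the text.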

Table 2

Perfect S-curve Obtained by Changing 1's to the Right
of the S-curve to 0 and 0's to the Left to 1

Sato (1975) defined a caution index for subject $i$ by taking the ratio
of two covariances. The numerator of the ratio is the covariance of the
observed row vector $i$, $(y_{ij})$, and the column-sum vector
$(y_{.j})$, $j = 1, 2, \ldots, n$, and the denominator is the
covariance of the corresponding scores assuming the S-curve is perfect,
$(M^{S}_{ij})$, $j = 1, \ldots, n$, and the column-sum vector
$(y_{.j})$, $j = 1, 2, \ldots, n$. More specifically, the caution index
$C_i$ for subject $i$ is given by

$$C_i = 1 - \frac{\sum_{j=1}^{n} (y_{ij} - P_{i.})(y_{.j} - P_{..})}{\sum_{j=1}^{n} (M^{S}_{ij} - P_{i.})(y_{.j} - P_{..})} \qquad (1)$$

and the caution index $C_j$ for item $j$ is given by

$$C_j = 1 - \frac{\sum_{i=1}^{N} (y_{ij} - P_{.j})(y_{i.} - P_{..})}{\sum_{i=1}^{N} (M^{P}_{ij} - P_{.j})(y_{i.} - P_{..})}.$$

The second term of the caution index for item $j$ is the ratio of two
covariances: the numerator is the covariance of column vector $j$,
$(y_{ij})$, and $(y_{i.})$, $i = 1, \ldots, N$, and the denominator is
the covariance of the vectors $(y_{i.})$ and $(M^{P}_{ij})$,
$i = 1, 2, \ldots, N$. The value of the denominator serves as a norm
value to standardize the numerator.
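
The two caution indices can be computed directly from an arranged S-P table. The sketch below is illustrative only: the function name `caution_indices`, the small example table and the choice to return `None` when the denominator is zero (for instance, for a student with a zero or perfect score) are assumptions, not part of Sato's definition.

```python
def caution_indices(table):
    """Sato's caution index for every row (person) and column (item)
    of an S-P table whose rows and columns are already ordered."""
    n_rows, n_cols = len(table), len(table[0])
    row_tot = [sum(r) for r in table]
    col_tot = [sum(r[j] for r in table) for j in range(n_cols)]
    p_all = sum(row_tot) / (n_rows * n_cols)  # P_..

    def index(observed, perfect, weights):
        # observed and perfect have the same mean (P_i. or P_.j).
        mean = sum(observed) / len(observed)
        num = sum((o - mean) * (w - p_all) for o, w in zip(observed, weights))
        den = sum((m - mean) * (w - p_all) for m, w in zip(perfect, weights))
        return None if den == 0 else 1 - num / den

    person = []
    for i, row in enumerate(table):
        perfect = [1 if j < row_tot[i] else 0 for j in range(n_cols)]  # M^S_ij
        person.append(index(row, perfect, col_tot))
    item = []
    for j in range(n_cols):
        column = [table[i][j] for i in range(n_rows)]
        perfect = [1 if i < col_tot[j] else 0 for i in range(n_rows)]  # M^P_ij
        item.append(index(column, perfect, row_tot))
    return person, item


# Hypothetical S-P table (rows and columns already ordered).
table = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 1, 0, 0],
]
person, item = caution_indices(table)
print(person)
print(item)
```

A row that coincides with its perfect pattern has an index of 0; the second student, who misses an easy item but answers the hardest one, receives a clearly larger value.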

The ratio in the above caution index is equal to the ratio of the
traditional discriminating index $r_j$, the item-total correlation, to
the standardized (or ideal, in the sense illustrated in Table 2)
discriminating index $r_j^{*}$ for item $j$. That is,

$$\frac{\mathrm{cov}_j(y_{ij}, y_{i.})}{\mathrm{cov}_j(M^{P}_{ij}, y_{i.})} = \frac{\mathrm{cov}_j(y_{ij}, y_{i.}) \,/\, [\sigma_j(y_{ij})\,\sigma(y_{i.})]}{\mathrm{cov}_j(M^{P}_{ij}, y_{i.}) \,/\, [\sigma_j(M^{P}_{ij})\,\sigma(y_{i.})]} = \frac{r_j}{r_j^{*}}.$$

It is clear that $\sum_i (y_{ij} - P_{.j})^2 = \sum_i (M^{P}_{ij} - P_{.j})^2$
because the number of 1's in column $j$ is invariant, as can be seen in
Tables 1 and 2; the number of 1's in the column vectors $(M^{P}_{ij})$
and $(y_{ij})$ is the same. Therefore, the two variances
$\sigma_j^2(y_{ij})$ and $\sigma_j^2(M^{P}_{ij})$ are equal.

The Extended Caution Index in Conjunction With Response Theory


Test  and  Group  Response  Curves:  One  Parameter  Logistic  Model. 


According to the one parameter logistic model, the item response curve
may be written

$$P_{b_j}(\theta) = \frac{1}{1 + \exp[-D(\theta - b_j)]}, \qquad j = 1, 2, \ldots, n,$$

where $\theta$ is the latent ability, $b_j$ is the difficulty of item
$j$ and $D$ is a constant which is set equal to 1.7 for convenience of
comparison to the normal ogive model (see Lord & Novick, 1968, p. 400).
In the above equation, $b_j$ is fixed and $\theta$ is a random
variable.

Although in practice the number of items, $n$, is finite, it is useful
to consider $b$ as a continuous variable. By holding $\theta_i$ fixed
and treating $b$ as a continuous variable, the dual function,
$S_{\theta_i}(b)$, of the one parameter logistic function may be
defined:

$$S_{\theta_i}(b) = \frac{1}{1 + \exp[-D(\theta_i - b)]}, \qquad i = 1, 2, \ldots, N.$$

Of course, the expression

$$\frac{1}{1 + \exp[-D(\theta_i - b_j)]}$$

may be considered to be a function of either $\theta$ or $b$. By choice
of which variable is fixed, the function may be used to define either
the item response curve, $P_{b_j}(\theta)$, or the person response
curve, $S_{\theta_i}(b)$ (see Lumsden, 1978; Weiss, 1977). Hence, the
variable within the parentheses of the function is treated as a random
variable and the subscript variable as a fixed variable.

The curves for the pair of functions $P_{b_0}(\theta)$ and
$S_{\theta_0}(b)$ are symmetric about the vertical axis at
$\theta = \theta_0$ (or equivalently $b = b_0$) provided
$\theta_0 = b_0$. As illustrated in Figure 1, however, the item
response curve (IRC) and the person response curve (PRC) intersect at
$(\theta_0 + b_0)/2$ if $\theta_0 \neq b_0$.


Insert  Figure  1  about  here 
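
The duality between the item response curve and the person response curve can be checked numerically. The following is a minimal sketch; the helper names `logistic_1pl`, `irc` and `prc` and the particular values of $\theta_0$ and $b_0$ are chosen only for illustration.

```python
import math

D = 1.7  # scaling constant of the logistic model


def logistic_1pl(theta, b, d=D):
    """One parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-d * (theta - b)))


def irc(b_j):
    """Item response curve P_b(theta): difficulty fixed, ability varies."""
    return lambda theta: logistic_1pl(theta, b_j)


def prc(theta_i):
    """Person response curve S_theta(b): ability fixed, difficulty varies."""
    return lambda b: logistic_1pl(theta_i, b)


theta0, b0 = 1.0, -0.5           # hypothetical ability and difficulty
item_curve = irc(b0)
person_curve = prc(theta0)

# The same cell value is obtained from either curve.
same_cell = (item_curve(theta0), person_curve(b0))

# The two curves cross at the midpoint (theta0 + b0) / 2.
midpoint = (theta0 + b0) / 2
crossing = (item_curve(midpoint), person_curve(midpoint))
print(same_cell, crossing)
```

The item curve increases in its argument and the person curve decreases, so the midpoint is their only point of intersection.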
Addition and subtraction, an inner product of two functions in the same
family (i.e., in one of the two families
$\{P_{b_1}(\theta), P_{b_2}(\theta), \ldots, P_{b_n}(\theta)\}$ or
$\{S_{\theta_1}(b), \ldots, S_{\theta_N}(b)\}$ in this paper), the norm
of a function and the distance between any two functions in the same
family will be defined below.

Definition. Addition and subtraction of two functions,
$P_{b_1}(\theta)$ and $P_{b_2}(\theta)$, or $S_{\theta_1}(b)$ and
$S_{\theta_2}(b)$, is defined as pointwise addition or subtraction of
the two. That is,

$$(P_{b_1} \pm P_{b_2})(\theta) = P_{b_1}(\theta) \pm P_{b_2}(\theta)$$

and

$$(S_{\theta_1} \pm S_{\theta_2})(b) = S_{\theta_1}(b) \pm S_{\theta_2}(b).$$

Definition. An inner product (or the sum of the cross products) of the
two functions is the sum of pairwise products
$P_{b_1}(\theta_i) P_{b_2}(\theta_i)$ [or equivalently
$S_{\theta_1}(b_j) S_{\theta_2}(b_j)$] or, more generally, the integral
of the product of the two functions with respect to $\theta$ (or $b$).
Thus

$$[P_{b_1}(\theta), P_{b_2}(\theta)] = \sum_{i=1}^{N} P_{b_1}(\theta_i) P_{b_2}(\theta_i) \quad \text{or} \quad \int P_{b_1}(\theta) P_{b_2}(\theta)\, d\theta$$

and

$$[S_{\theta_1}(b), S_{\theta_2}(b)] = \sum_{j=1}^{n} S_{\theta_1}(b_j) S_{\theta_2}(b_j) \quad \text{or} \quad \int S_{\theta_1}(b) S_{\theta_2}(b)\, db.$$

Definition. The squared norms of the functions $P_b(\theta)$ and
$S_\theta(b)$ are given by the inner product of each function with
itself. Thus, we have

$$\|P_b\|^2 = [P_b(\theta), P_b(\theta)] = \sum_{i=1}^{N} P_b^2(\theta_i) \quad \text{or} \quad \int P_b^2(\theta)\, d\theta,$$

and

$$\|S_\theta\|^2 = [S_\theta(b), S_\theta(b)] = \sum_{j=1}^{n} S_\theta^2(b_j) \quad \text{or} \quad \int S_\theta^2(b)\, db.$$

Definition. The squared distance of two functions $P_{b_1}(\theta)$ and
$P_{b_2}(\theta)$ [or $S_{\theta_1}(b)$ and $S_{\theta_2}(b)$] is the
inner product of their difference with itself. That is,

$$\|P_{b_1} - P_{b_2}\|^2 = [P_{b_1}(\theta) - P_{b_2}(\theta),\; P_{b_1}(\theta) - P_{b_2}(\theta)] = \|P_{b_1}\|^2 + \|P_{b_2}\|^2 - 2[P_{b_1}, P_{b_2}]$$

and

$$\|S_{\theta_1} - S_{\theta_2}\|^2 = [S_{\theta_1}(b) - S_{\theta_2}(b),\; S_{\theta_1}(b) - S_{\theta_2}(b)] = \|S_{\theta_1}\|^2 + \|S_{\theta_2}\|^2 - 2[S_{\theta_1}, S_{\theta_2}].$$

By using the notation of integration,

$$\|P_{b_1} - P_{b_2}\|^2 = \int [P_{b_1}(\theta) - P_{b_2}(\theta)]^2\, d\theta$$

or

$$\|S_{\theta_1} - S_{\theta_2}\|^2 = \int [S_{\theta_1}(b) - S_{\theta_2}(b)]^2\, db.$$
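
The discrete (summation) forms of these definitions are easy to evaluate. The sketch below uses a handful of hypothetical ability values as the evaluation points and checks that the squared distance equals the sum of the squared norms minus twice the inner product; the helper names are assumptions made for the example.

```python
import math


def p(theta, b, d=1.7):
    """One parameter logistic item response function P_b(theta)."""
    return 1.0 / (1.0 + math.exp(-d * (theta - b)))


def inner(f, g, points):
    """Inner product as the sum of pairwise products over the given points."""
    return sum(f(x) * g(x) for x in points)


def sq_norm(f, points):
    return inner(f, f, points)


def sq_distance(f, g, points):
    diff = lambda x: f(x) - g(x)
    return inner(diff, diff, points)


thetas = [-1.0, 0.0, 0.5, 2.0]          # hypothetical ability values theta_i
p_b1 = lambda theta: p(theta, -0.5)     # IRC of an item with b = -0.5
p_b2 = lambda theta: p(theta, 1.0)      # IRC of an item with b = 1.0

lhs = sq_distance(p_b1, p_b2, thetas)
rhs = sq_norm(p_b1, thetas) + sq_norm(p_b2, thetas) - 2 * inner(p_b1, p_b2, thetas)
print(lhs, rhs)
```

The same helpers apply unchanged to person response curves if the evaluation points are item difficulties rather than abilities.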

With these definitions, we are ready to introduce the dual concept of
the Test Response Curve (Lord, 1980; Lord and Novick, 1968). This is
the Group Response Curve, an average function of $N$ different Person
Response Curves. The Test Response Curve (TRC) is an average function
of $n$ IRCs, defined as

$$T(\theta) = \frac{1}{n} \sum_{j=1}^{n} P_{b_j}(\theta).$$

Similarly, the Group Response Curve (GRC) is an average function of $N$
PRCs, that is,

$$G(b) = \frac{1}{N} \sum_{i=1}^{N} S_{\theta_i}(b).$$

Illustrative PRCs and IRCs for 100 hypothetical persons were generated
by randomly sampling 100 values of $\theta$ from a unit normal
distribution. The resulting TRC for the simulated 100 item test is
shown as the monotonically increasing function in Figure 2.

Insert Figure 2 about here

The curve that is a monotonically decreasing function is the PRC of
$\theta = 0$, denoted by $S_0(b)$. The curve represented by +'s is a
Group Response Curve, which is obtained by taking the pointwise mean of
100 PRCs over the randomly generated 100 $b$ values. That is,

$$G(b) = \frac{1}{100} \sum_{i=1}^{100} S_{\theta_i}(b).$$

As the number of $b$ values approaches infinity, $G(b)$ in the figure
will become a smooth, monotonically decreasing curve; moreover, if the
number of $\theta$ values is also very large, then $G(b)$ will be the
mirror image of $T(\theta)$ about the vertical line $\theta = b = 0$.
In this figure, $\theta_i$, $i = 1, 2, \ldots, 100$, and $b_j$,
$j = 1, 2, \ldots, 100$, are randomly chosen from $N(0,1)$, so their
means are not exactly zero. It can be shown numerically that
$T(\theta)$ and $G(b)$ reach 1/2 at
$\bar{\theta} = \frac{1}{100}\sum_i \theta_i$ and
$\bar{b} = \frac{1}{100}\sum_j b_j$ respectively.

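Let us denote the average of $T(\theta_i)$, $i = 1, \ldots, N$, by
$\bar{T}$,

$$\bar{T} = \frac{1}{N} \sum_{i=1}^{N} T(\theta_i),$$

and the average of $G(b_j)$, $j = 1, \ldots, n$, by $\bar{G}$,

$$\bar{G} = \frac{1}{n} \sum_{j=1}^{n} G(b_j).$$

Then $\bar{T} = \bar{G}$, because

$$\bar{T} = \frac{1}{N} \sum_{i=1}^{N} T(\theta_i) = \frac{1}{nN} \sum_{i=1}^{N} \sum_{j=1}^{n} \frac{1}{1 + \exp[-D(\theta_i - b_j)]} = \frac{1}{n} \sum_{j=1}^{n} G(b_j) = \bar{G}.$$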
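
The identity between the two averages can be verified with a small simulation in the spirit of the one described above. This is only a sketch: the random seed, the sample sizes and the helper names `trc` and `grc` are assumptions, and the sampled values will not match those behind Figure 2.

```python
import math
import random


def p(theta, b, d=1.7):
    """One parameter logistic probability; equals P_b(theta) and S_theta(b)."""
    return 1.0 / (1.0 + math.exp(-d * (theta - b)))


random.seed(0)                                       # arbitrary seed
thetas = [random.gauss(0, 1) for _ in range(100)]    # abilities theta_i
bs = [random.gauss(0, 1) for _ in range(100)]        # difficulties b_j


def trc(theta):
    """Test response curve: mean of the n item response curves at theta."""
    return sum(p(theta, b) for b in bs) / len(bs)


def grc(b):
    """Group response curve: mean of the N person response curves at b."""
    return sum(p(t, b) for t in thetas) / len(thetas)


t_bar = sum(trc(t) for t in thetas) / len(thetas)
g_bar = sum(grc(b) for b in bs) / len(bs)
print(t_bar, g_bar)
```

Both averages are the grand mean of the same N x n array of probabilities, so they agree up to floating-point rounding regardless of the values sampled.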

Definition  of  Various  Extended  Caution  Indices 

Sato's (1975) S-curve may be viewed as a discrete test response curve.
The perfect S-curve divides 1's and 0's into two mutually exclusive
areas with 1's under the curve and 0's above it. Note, however, that
direct correspondence in this way involves a reordering of the subjects
from low to high rather than from high to low as typically presented by
Sato and as was shown in Table 2. The TRC, $T(\theta)$, represents the
average probability of correctly answering items on the test when a
person's ability is equal to $\theta$. The analogy between the S-curve
and a TRC may be seen by considering an alternative $N$ by $n$ score
matrix with real numbers based on IRT rather than binary scores. More
specifically, let

$$PM_{ij} = P_{b_j}(\theta_i),$$

where $\theta_i$ is the estimated ability parameter for person $i$ and
$b_j$ is the estimated item parameter for item $j$, under the condition
that

[condition on the estimates illegible in source]

Since $P_{b_j}(\theta_i) = S_{\theta_i}(b_j)$ for fixed $i$ and $j$,
the cells of the probability matrix $(PM_{ij})$ are also equal to
$S_{\theta_i}(b_j)$. If the rows and columns of this matrix are
arranged in the manner of the S-P table and columnwise sums of the cell
entries are obtained, the result is $N$ times $G(b_j)$, which
corresponds to the P-curve. Similarly, $n$ times $T(\theta_i)$,
corresponding to the S-curve, may be obtained by summing the cell
entries for each row.
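
As a small numerical check of this correspondence, the sketch below builds a probability matrix from hypothetical parameter values (not estimates from any real data) and compares its row and column sums with $nT(\theta_i)$ and $NG(b_j)$ computed from the curve definitions.

```python
import math


def p(theta, b, d=1.7):
    """One parameter logistic probability P_b(theta) = S_theta(b)."""
    return 1.0 / (1.0 + math.exp(-d * (theta - b)))


thetas = [1.5, 0.0, -1.0]            # hypothetical abilities, high to low
bs = [-0.5, 0.0, 0.5, 1.0]           # hypothetical difficulties, easy to hard
N, n = len(thetas), len(bs)

# Probability matrix PM_ij = P_{b_j}(theta_i).
pm = [[p(t, b) for b in bs] for t in thetas]


def trc(theta):
    return sum(p(theta, b) for b in bs) / n


def grc(b):
    return sum(p(t, b) for t in thetas) / N


row_sums = [sum(row) for row in pm]
col_sums = [sum(pm[i][j] for i in range(N)) for j in range(n)]
print(row_sums, [n * trc(t) for t in thetas])
print(col_sums, [N * grc(b) for b in bs])
```

With persons ordered from high to low ability and items from easy to hard, the row and column sums decrease monotonically, mirroring the stair-step S and P curves of the binary table.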

Selected rows and columns of a probability matrix $(PM_{ij})$ are
illustrated in Table 3 for a 32 item test involving the subtraction of
signed numbers that was administered to a sample of 127 students
(Tatsuoka & Tatsuoka, 1981). Also shown in Table 3 are the values of
the estimated item and ability parameters and the test and group
response curves evaluated at those estimated parameter values (i.e.,
$T(\theta_i)$ and $G(b_j)$ respectively).
Insert Table 3 about here

Before introducing the extended caution index, it is useful to compare
the S and P curves for the data from which the estimates in Table 3
were obtained with their counterparts, i.e., $n$ times $T(\theta_i)$
and $N$ times $G(b_j)$. The two comparisons, S with $nT(\theta_i)$ and
P with $NG(b_j)$, are provided in Figures 3 and 4 respectively. The
tick marks on the horizontal axis in Figure 3 indicate the location of
the $\theta$'s for the 127 students in the study. The tick marks in
Figure 4 show the values of $b_j$ for the 32 items. The close
correspondence between the two pairs of curves is apparent. The number
of items and the limited range of values that $\hat{b}_j$ assumes for
these data obviously limit the evaluation of the correspondence between
the curves in Figure 4, however.

Insert  Figures  3  &  4  about  here 

Given the parallels between the S-P curves and the GRC and TRC, the
extension of the caution index for use with the latter curves is
relatively straightforward. There are, however, several natural ways in
which the extension can be made. Possibly the most obvious extension is
to simply replace the term $(M^{S}_{ij} - P_{i.})$ in the denominator
of equation (1) by its counterpart from the $PM_{ij}$ matrix, i.e.,

$$[PM_{ij} - T(\theta_i)] = [S_{\theta_i}(b_j) - T(\theta_i)].$$

With the above substitution, our first extended caution index, $C1_i$,
is defined as

$$C1_i = 1 - \frac{\sum_{j=1}^{n} (y_{ij} - P_{i.})(y_{.j} - P_{..})}{\sum_{j=1}^{n} [S_{\theta_i}(b_j) - T(\theta_i)](y_{.j} - P_{..})}.$$

The numerator divided by $n$, i.e., the covariance of $(y_{ij})$ and
$(y_{.j})$, can be expanded to the sum of

$$\frac{1}{n} \sum_{j=1}^{n} y_{ij}\, y_{.j} \quad \text{and} \quad -P_{..}\, P_{i.}.$$

The value of the second term does not depend on a person's response to
each item but only on his or her total score. As long as the total
score is fixed, an anomalous response pattern will not be detected by
this value. This value varies between persons, so if two persons have
the same achievement level $\theta_i$, then the judgment regarding the
extent to which each response pattern deviates from the norm depends
only on the first term of the numerator. Since the denominator is a
normalizing constant for a fixed value of $\theta_i$, it is unlikely
that a particular aberrant response pattern produced by an individual
whose achievement level is $\theta_i$ will affect the denominator.

Thus, it is natural to expect that if both quantities are replaced by
the inner products of the two row vectors $(y_{ij})$ and $(y_{.j})$,
$j = 1, 2, \ldots, n$, the values of $C1_i$ will be affected by the
degree of anomaly of individual response patterns. Moreover,
calculation of inner products is easier than that of covariances. Let
us define four other natural extensions of the caution index as
follows.

Definition. Four alternative definitions of the extended caution index
for person $i$ are:

$$C2_i = 1 - \frac{\sum_{j=1}^{n} y_{ij}\, y_{.j}}{\sum_{j=1}^{n} S_{\theta_i}(b_j)\, y_{.j}},$$

$$C3_i = 1 - \frac{\sum_{j=1}^{n} y_{ij}\, S_{\theta_i}(b_j)}{\sum_{j=1}^{n} G(b_j)\, S_{\theta_i}(b_j)},$$

[expressions for $C4_i$ and $C5_i$ illegible in source]

These indices fall into two categories. The indices in the first
category, $C2_i$ and $C4_i$, give measures that are more group
dependent, because they are the sums of cross products of the
corresponding elements of the observed vector $(y_{ij})$ with the
column-sum vector $(y_{.j})$ and with the Group Response Curve
$G(b_j)$ respectively. They measure the relationship of an observed
response pattern for person $i$ to a normed variable derived from the
group to which person $i$ belongs. Thus these indices have a function
similar to that of the Norm Conformity Index, NCI, defined in Tatsuoka
& Tatsuoka (1980). The remaining indices, $C3_i$ and $C5_i$, are more
individually oriented. That is, the quantities obtained from $C3_i$ and
$C5_i$ reflect the extent to which person $i$'s response pattern
$(y_{ij})$ relates to a theoretically derived PRC at the fixed level of
$\theta_i$. Thus, it can be said that the indices $C3_i$ and $C5_i$ are
similar to the Individual Consistency Index (Tatsuoka & Tatsuoka,
1980).


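These extended caution indices for person $i$ are easily altered to
those for item $j$:

$$C2_j = 1 - \frac{\sum_{i=1}^{N} y_{ij}\, y_{i.}}{\sum_{i=1}^{N} P_{b_j}(\theta_i)\, y_{i.}},$$

$$C3_j = 1 - \frac{\sum_{i=1}^{N} y_{ij}\, P_{b_j}(\theta_i)}{\sum_{i=1}^{N} P_{b_j}(\theta_i)\, T(\theta_i)},$$

$$C4_j = 1 - \frac{\sum_{i=1}^{N} y_{ij}\, T(\theta_i)}{\sum_{i=1}^{N} P_{b_j}(\theta_i)\, T(\theta_i)},$$

and

$$C5_j = 1 - \frac{\sum_{i=1}^{N} y_{ij}\, P_{b_j}(\theta_i)}{\sum_{i=1}^{N} y_{ij}\, T(\theta_i)}.$$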


Similarly, the indices $C3_j$ and $C5_j$ are potentially useful for
detecting anomalous response patterns in comparison with item $j$'s
IRC, while $C2_j$ and $C4_j$ are potentially useful for identifying
items whose response patterns deviate from that of the test as a whole,
the TRC.

The  Case  of  Two  and  Three  Parameter  Logistic  Models 
Problems  in  Person  Response  Curves  and  Group  Response  Curves 

Person Response Curves for the one parameter logistic model are
represented by smooth monotonically decreasing functions defined over
the difficulties of infinitely many items. But the PRC for the two
parameter logistic model is no longer a smooth, monotonically
decreasing curve. Figure 5 provides the graph of the Person Response
Curve for the ability level $\theta = 0$ as well as the Test Response
Curve of the two parameter logistic model, where the ability measures
$\theta_i$, $i = 1, 2, \ldots, 100$, were randomly sampled from a
normal (0,1) distribution, the difficulties $b_j$,
$j = 1, 2, \ldots, 100$, were also randomly sampled from a normal (0,1)
distribution and the item discrimination indices $a_j$,
$j = 1, \ldots, 100$, were drawn from the uniform distribution on the
interval (0.8, L). The Test Response Curve and the Person Response
Curves are given by

$$T(\theta) = \frac{1}{n} \sum_{j=1}^{n} P_{b_j}(\theta)$$

and

$$S_{\theta_0}(b) = \frac{1}{1 + \exp[-Da(\theta_0 - b)]}$$

for a fixed $\theta_0$ and variable $b$.

Insert Figure 5 about here

The dotted line (+++) in the figure is the Group Response Curve of a
hundred subjects. Although each PRC oscillates locally, especially
around the origin, the GRC (the mean curve of these PRCs) becomes
fairly smooth and almost monotonically decreasing. Since $b_j$,
$j = 1, \ldots, 100$, are randomly selected from $N(0,1)$, a larger
oscillation of the PRC around the mean 0 is expected. But the GRC is
expected to become smoother as the numbers of students and items
increase.

Insert  Figure  6  about  here 

Figure 6 is the graph of the TRC, GRC and PRC of $\theta = 0$ for the
three parameter logistic model. The parameters $\theta_i$, $b_j$ and
$a_j$ were generated by the same method as for the two parameter model;
then fifty c-values of 0.15 and fifty of 0.20 were randomly assigned to
the 100 pairs of $a_j$ and $b_j$ to make the three parameter logistic
model.

It seems that the smoothness of the GRC for the three parameter
logistic model is about the same, differing only, as expected, in terms
of the lower asymptote. A larger number of subjects will be needed for
the three parameter case in order to obtain a smoother GRC.
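
A two-item example shows why the person response curve loses monotonicity once discriminations differ, and how the guessing parameter sets the lower asymptote. This is a sketch with hypothetical item parameters; the three parameter function uses the standard form $c + (1 - c)$ times the two parameter logistic, which is assumed here because the text does not write it out.

```python
import math

D = 1.7


def p_2pl(theta, a, b):
    """Two parameter logistic model with discrimination a and difficulty b."""
    return 1.0 / (1.0 + math.exp(-D * a * (theta - b)))


def p_3pl(theta, a, b, c):
    """Three parameter logistic model (standard form, assumed) with lower asymptote c."""
    return c + (1.0 - c) * p_2pl(theta, a, b)


theta0 = 0.0
# Hypothetical items as (a_j, b_j): the second is harder but less discriminating.
items = [(1.6, 0.5), (0.8, 0.6)]

prc_2pl = [p_2pl(theta0, a, b) for a, b in items]
prc_1pl = [p_2pl(theta0, 1.0, b) for _, b in items]   # equal discriminations

# Under equal discriminations the harder item is less likely to be answered
# correctly; with unequal discriminations the order is reversed for this person.
print(prc_1pl, prc_2pl)

# For very low ability the 3PL probability approaches the guessing parameter c.
low_ability = p_3pl(-50.0, 1.0, 0.0, 0.15)
print(low_ability)
```

Averaging many such person curves over a group tends to cancel these local reversals, which is consistent with the smoother GRC described above.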

The definition of the extended caution indices may be applied more
generally to the two and three parameter logistic models in essentially
the same manner as it was developed for the one parameter model.

Note that the arrangement of rows and columns according to the order of
the proportions correct (p values) for the $n$ items and the total
scores for the $N$ subjects is essential to determine the S-P curves
and the values of $M^{P}_{ij}$ and $M^{S}_{ij}$, $i = 1, 2, \ldots, N$,
$j = 1, \ldots, n$. With our extended caution indices, the arrangement
of rows and columns in monotonic order of the probabilities is no
longer necessary.

Application  of  New  Indices  for  the  Detection  of  Anomalous  Responses. 

There is evidence that student errors on certain types of arithmetic
problems are frequently quite systematic (Brown and Burton, 1978;
Birenbaum and Tatsuoka, 1980; Davis and McKnight, 1980). That is,
students seem to consistently apply erroneous algorithms in attempting
to answer a problem of a particular form. Sometimes erroneous or
incomplete rules result in the right answer. For example, a student who
consistently treats a multiplication sign as if it were an addition
sign would get the right answer to the problem 2 x 2 = 4, but would get
it for the wrong reason. A score of zero for using the wrong operation
would be a better reflection of the student's ability to multiply than
a score of one for answering "4" to the item.

Birenbaum and Tatsuoka (1980) have demonstrated that the customary
zero-one scoring of incorrect and correct answers can give the
appearance of higher dimensionality and cause difficulty in attempting
to apply IRT when students consistently apply erroneous rules to the
addition and subtraction of signed numbers. The difficulties result
from the fact that several erroneous rules frequently yield the right
answer for some problems. Right answers for the wrong reasons not only
cause problems in applying IRT, but more importantly they can result in
misleading scores and make it difficult to diagnose what a student is
doing wrong.

By painstaking work Tatsuoka and her colleagues (Birenbaum and
Tatsuoka, 1980; Birenbaum, 1981) were able to identify several
erroneous rules that were consistently applied by certain students.
Birenbaum and Tatsuoka (1980) reanalyzed their data after converting
ones to zeroes for items that students got right for the wrong reasons.
That is, an item score was changed from one to zero if (1) a student
was identified as consistently applying an erroneous rule and (2)
application of that erroneous rule would lead to the correct answer for
the particular item in question. Analysis of the resulting modified
data indicated that the data were more nearly unidimensional and there
was good evidence that IRT was more applicable to the modified data
than to the original data.

Anomalous response patterns can sometimes be found by conducting an
intuitive error analysis or by clinical interviews. Both approaches
require enormous effort. Brown and Burton (1978) and Tatsuoka et al.
(1980) have developed computerized approaches to error analysis. But
these methods are expensive and were based on extensive work with
highly specific item content.

Tatsuoka and Tatsuoka (1981) demonstrated an index, called the
individualized consistency index (ICI), which was shown to be useful in
detecting a variety of erroneous rules of operation in signed-number
addition and subtraction problems. Using the ICI to detect examinees
who are apt to have a misconception saves considerable effort because
only examinees so identified have their item responses routed to the
detailed error-diagnostic system. Application of the ICI is limited,
however, because it requires repeated measures, i.e., several items
based on an identical item form, within the test. Such repetition is
not common on most tests.

As will be seen below, the index similar to the ICI, $C3_i$, not only
avoids the repeated measure limitation but is apparently more effective
for purposes of detecting anomalous response patterns resulting from
the consistent application of an erroneous rule. Tatsuoka & Tatsuoka
(1981) showed a list of erroneous rules of operation ("bugs") detected
by the ICI. The 32 response patterns resulting from these bugs are
classified in Group A. The rest of the 103 response patterns are
classified into two groups according to the error-diagnostic system,
SIGNBUG. Group B consists of 7 response patterns which probably reflect
inconsistent use of one or two erroneous rules; Group C consists of
patterns that reflect adequate use of the right rule of operation
and/or show no indication of systematic errors. The errors observed in
Group C are apparently just random errors. The estimated item and
person ability parameters needed to compute the extended caution
indices were obtained by the computer program GETAB (Robert Baillie,
1979), using Birenbaum & Tatsuoka's modified dataset.

Distributions of the indices $C2_i$ and $C3_i$ are displayed in Figures
6 and 7 respectively. Only members of groups A and B (persons who
consistently used an erroneous rule) and of group C (persons who made a
substantial number of errors but whose errors were not the result of
consistent use of an erroneous rule) are included in the distributions
shown in Figures 6 and 7. In both figures, persons in groups A and B
are depicted by shaded boxes and those in group C by unshaded boxes.
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Insert  Figures  7  and  3  about  here 

As can be seen in Figure 6, C2* does not provide any basis for
distinguishing persons who are consistently using an erroneous rule from
those who are not. The two groups are distinguished almost perfectly,
however, by the magnitude of C3* (see Figure 7). Indeed, there is
almost no overlap between the two groups. All 39 members of Groups A and B
have values of C3* of .05 or higher, whereas only two of the 88 members
of group C have positive values of C3* and none of the members of
group C has a value of C3* as large as .05. Thus, C3* may be used to
identify with a high degree of accuracy those persons who consistently
use an erroneous rule.
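
The screening rule implied by this result can be sketched in a few lines.
The sketch is only an illustration: the function name, the examinee
labels, and the index values are hypothetical, and the .05 cutoff is
simply the value that separates the groups in this data set, not a
threshold known to hold for other tests.

```python
def flag_consistent_rule_users(c3_by_examinee, cutoff=0.05):
    """Return labels of examinees whose C3* value is at or above the cutoff.

    The cutoff of .05 mirrors the separation reported for Figure 7;
    treating it as a fixed screening threshold is an assumption.
    """
    return [label for label, c3 in c3_by_examinee.items() if c3 >= cutoff]


# Hypothetical C3* values, invented for illustration only.
example_values = {"s01": 0.53, "s02": -0.27, "s03": 0.05, "s04": 0.01}
flagged = flag_consistent_rule_users(example_values)
```

Under this rule, only the flagged examinees would have their item
responses routed to a detailed error-diagnostic system, in the same way
the ICI was used for screening.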

As might be expected from a comparison of the coefficients, C4*
works in a fashion quite similar to C2*, and C5* works much like C3* in
terms of the ability of these indices to distinguish members of groups A,
B and C. It is clear that C2* and C4* are not useful for detecting
anomalous response patterns resulting from consistent application of an
erroneous rule. These indices may be useful for other tasks for which
NCI or Van der Flier's index (Harnisch & Linn, 1981) have been found to
be useful. The third and fifth indices (C3* and C5*), however, are quite
effective for purposes of detecting persons who make consistent errors.

Insert  Table  4  about  here 

Table 4 shows a summary of t-statistics comparing the means on the
four generalized caution indices and ICI in the two groups: A and B
combined versus C by itself. The t-value for index 2 is not significant,
but all others are significant. Index 1 is excluded from the analysis
because the denominator of this index becomes infinity when all items
are correctly answered by all examinees.

Table 4

A Summary of t-statistics Comparing the Means on the
Four Generalized Caution Indices and ICI in the Two Groups

Indices           Group A & B    Group C     t-value     p

N                     39           88

Index 2   Mean      -.0170       -.0065        .689      .4980
          S.D.       .0929        .0306

Index 3   Mean       .5310       -.2688     -19.293    < .00005
          S.D.       .2444        .1300

Index 4   Mean       .0650       -.0045      -3.466    < .0015
          S.D.       .1237        .0293

Index 5   Mean       .5091       -.2643     -17.467    < .00005
          S.D.       .2615        .1350

ICI       Mean       .9223        .8144      -7.121    < .00005
          S.D.       .0645        .1058
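
The t-values in Table 4 can be checked from the printed summary
statistics. The sketch below assumes, although the text does not say so,
that each t is an unequal-variance (Welch-type) statistic for the Group C
mean minus the combined A and B mean. Under that assumption the tabled
values for Indices 2, 4 and 5 are reproduced to three decimals; Index 3
and ICI come out close but not identical, presumably because the printed
means and standard deviations are rounded. The function and variable
names are illustrative.

```python
import math


def unequal_variance_t(mean_ab, sd_ab, n_ab, mean_c, sd_c, n_c):
    """t statistic for (mean_c - mean_ab) without pooling the variances.

    Assumption: this is the form behind Table 4; the source does not
    state which t-test was used.
    """
    standard_error = math.sqrt(sd_ab ** 2 / n_ab + sd_c ** 2 / n_c)
    return (mean_c - mean_ab) / standard_error


N_AB, N_C = 39, 88  # group sizes from Table 4

# (mean, S.D.) for Groups A & B, then (mean, S.D.) for Group C, from Table 4.
table4 = {
    "Index 2": ((-0.0170, 0.0929), (-0.0065, 0.0306)),
    "Index 3": ((0.5310, 0.2444), (-0.2688, 0.1300)),
    "Index 4": ((0.0650, 0.1237), (-0.0045, 0.0293)),
    "Index 5": ((0.5091, 0.2615), (-0.2643, 0.1350)),
    "ICI": ((0.9223, 0.0645), (0.8144, 0.1058)),
}

t_values = {
    name: unequal_variance_t(ab[0], ab[1], N_AB, c[0], c[1], N_C)
    for name, (ab, c) in table4.items()
}

for name, t in t_values.items():
    print(f"{name}: t = {t:.3f}")
```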

Discussion 

As was shown above, the caution index which Sato developed based
solely on a comparison of observed item responses to group responses can be
readily extended to theory-based estimates of person and group response
probabilities. The caution index is a linear transformation of the
covariance of a person's response pattern with one or another of the
theoretical curves computed using item-response theory. Alternatively,
the extended caution indices may be viewed as linear transformations of
the distance between a person's response pattern and a theoretical curve
(either the person response curve, as in the case of C3* and C5*, or the
group response curve, as in the case of C4*).
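
As a point of reference, the observed-score caution index is commonly
written as one minus the ratio of two covariances: the covariance of the
person's response pattern with the item totals, divided by the
covariance of the Guttman (perfect) pattern having the same total score
with the item totals. The sketch below follows that commonly cited form.
It is an assumption about the formula, which is not restated in this
section, and the function name and data are hypothetical. The extended
indices replace the observed item totals with theory-based curves and
are not implemented here.

```python
def caution_index(responses, item_totals):
    """Observed-score caution index for one examinee (commonly cited form).

    responses   -- list of 0/1 item scores for the examinee
    item_totals -- number of examinees answering each item correctly
    """
    n_items = len(responses)
    score = sum(responses)

    # Guttman pattern: the examinee's `score` easiest items are correct.
    easiest_first = sorted(range(n_items), key=lambda j: -item_totals[j])
    guttman = [0] * n_items
    for j in easiest_first[:score]:
        guttman[j] = 1

    mean_total = sum(item_totals) / n_items

    def covariance_with_totals(pattern):
        mean_pattern = sum(pattern) / n_items
        return sum(
            (pattern[j] - mean_pattern) * (item_totals[j] - mean_total)
            for j in range(n_items)
        ) / n_items

    denominator = covariance_with_totals(guttman)
    if denominator == 0:
        # Happens for zero or perfect scores, or when all item totals are equal.
        raise ValueError("caution index is undefined for this pattern")
    return 1 - covariance_with_totals(responses) / denominator


# Hypothetical item totals for five items, easiest first.
totals = [90, 80, 60, 40, 20]
perfect_pattern_index = caution_index([1, 1, 1, 0, 0], totals)
reversed_pattern_index = caution_index([0, 0, 1, 1, 1], totals)
```

A response pattern that matches the Guttman pattern gives an index of
zero, and the index grows as correct answers shift toward the harder
items.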

The application of the extended caution indices that were
introduced in this paper provided strong evidence that the indices that
depend on the distance between a person's response pattern and their
theoretical person response curve (i.e., C3* and C5*) are quite
effective for purposes of identifying persons who consistently use an
erroneous rule in answering signed-number arithmetic problems. This is
a potentially important result that deserves further investigation with
other data sets involving different types of achievement test data. If
additional research yields similar results, these indices may have
considerable instructional utility because instruction can be made much
more specific once it is determined that a student is consistently
making an error as the result of a particular misconception.